{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<center><img src=\"https://raw.githubusercontent.com/openvinotoolkit/anomalib/main/docs/source/images/logos/anomalib-wide-blue.png\" alt=\"Paris\" class=\"center\"></center>\n",
    "\n",
    "<center>💙 A library for benchmarking, developing and deploying deep learning anomaly detection algorithms</center>\n",
    "\n",
    "______________________________________________________________________\n",
    "\n",
    "> NOTE:\n",
    "> This notebook is originally created by @innat on [Kaggle](https://www.kaggle.com/code/ipythonx/mvtec-ad-anomaly-detection-with-anomalib-library/notebook).\n",
    "\n",
    "[Anomalib](https://github.com/openvinotoolkit/anomalib): Anomalib is a deep learning library that aims to collect state-of-the-art anomaly detection algorithms for benchmarking on both public and private datasets. Anomalib provides several ready-to-use implementations of anomaly detection algorithms described in the recent literature, as well as a set of tools that facilitate the development and implementation of custom models. The library has a strong focus on image-based anomaly detection, where the goal of the algorithm is to identify anomalous images, or anomalous pixel regions within images in a dataset.\n",
    "\n",
    "The library supports [`MVTec AD`](https://www.mvtec.com/company/research/datasets/mvtec-ad) (CC BY-NC-SA 4.0) and [`BeanTech`](https://paperswithcode.com/dataset/btad) (CC-BY-SA) for **benchmarking** and `folder` for custom dataset **training/inference**. In this notebook, we will explore `anomalib` training a PADIM model on the `MVTec AD` bottle dataset and evaluating the model's performance. The sections in this notebook explores the steps in `tools/train.py` more in detail. Those who would like to reproduce the results via CLI could use `python tools/train.py --model padim`."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Installing Anomalib"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Installation can be done in two ways: (i) install via PyPI, or (ii) installing from sourc, both of which are shown below:"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### I. Install via PyPI"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Option - I: Uncomment the next line if you want to install via pip.\n",
    "# %pip install anomalib"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### II. Install from Source\n",
    "This option would initially download anomalib repository from github and manually install `anomalib` from source, which is shown below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Option - II: Uncomment the next three lines if you want to install from the source.\n",
    "# !git clone https://github.com/openvinotoolkit/anomalib.git\n",
    "# %cd anomalib\n",
    "# %pip install ."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " Now let's verify the working directory. This is to access the datasets and configs when the notebook is run from different platforms such as local or Google Colab."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from __future__ import annotations\n",
    "\n",
    "import os\n",
    "from pathlib import Path\n",
    "from typing import Any\n",
    "\n",
    "from git.repo import Repo\n",
    "\n",
    "current_directory = Path.cwd()\n",
    "if current_directory.name == \"000_getting_started\":\n",
    "    # On the assumption that, the notebook is located in\n",
    "    #   ~/anomalib/notebooks/000_getting_started/\n",
    "    root_directory = current_directory.parent.parent\n",
    "elif current_directory.name == \"anomalib\":\n",
    "    # This means that the notebook is run from the main anomalib directory.\n",
    "    root_directory = current_directory\n",
    "else:\n",
    "    # Otherwise, we'll need to clone the anomalib repo to the `current_directory`\n",
    "    repo = Repo.clone_from(url=\"https://github.com/openvinotoolkit/anomalib.git\", to_path=current_directory)\n",
    "    root_directory = current_directory / \"anomalib\"\n",
    "\n",
    "os.chdir(root_directory)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from IPython.display import display\n",
    "from PIL import Image\n",
    "from pytorch_lightning import Trainer\n",
    "from torchvision.transforms import ToPILImage\n",
    "\n",
    "from anomalib.config import get_configurable_parameters\n",
    "from anomalib.data import get_datamodule\n",
    "from anomalib.models import get_model\n",
    "from anomalib.pre_processing.transforms import Denormalize\n",
    "from anomalib.utils.callbacks import LoadModelCallback, get_callbacks"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Model\n",
    "\n",
    "Currently, there are **13** anomaly detection models available in `anomalib` library. Namely,\n",
    "\n",
    "- [CFA](https://arxiv.org/abs/2206.04325)\n",
    "- [CS-Flow](https://arxiv.org/abs/2110.02855v1)\n",
    "- [CFlow](https://arxiv.org/pdf/2107.12571v1.pdf)\n",
    "- [DFKDE](https://github.com/openvinotoolkit/anomalib/tree/main/anomalib/models/dfkde)\n",
    "- [DFM](https://arxiv.org/pdf/1909.11786.pdf)\n",
    "- [DRAEM](https://arxiv.org/abs/2108.07610)\n",
    "- [FastFlow](https://arxiv.org/abs/2111.07677)\n",
    "- [Ganomaly](https://arxiv.org/abs/1805.06725)\n",
    "- [Padim](https://arxiv.org/pdf/2011.08785.pdf)\n",
    "- [Patchcore](https://arxiv.org/pdf/2106.08265.pdf)\n",
    "- [Reverse Distillation](https://arxiv.org/abs/2201.10703)\n",
    "- [R-KDE](https://ieeexplore.ieee.org/document/8999287)\n",
    "- [STFPM](https://arxiv.org/pdf/2103.04257.pdf)\n",
    "\n",
    "In this tutorial, we'll be using Padim. Now, let's get their config paths from the respected folders."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Configuration\n",
    "\n",
    "In this demonstration, we will choose [Padim](https://arxiv.org/pdf/2011.08785.pdf) model from the above list. Let's take a quick look at its config file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "MODEL = \"padim\"  # 'padim', 'cflow', 'stfpm', 'ganomaly', 'dfkde', 'patchcore'\n",
    "CONFIG_PATH = root_directory / f\"anomalib/models/{MODEL}/config.yaml\"\n",
    "with open(file=CONFIG_PATH, mode=\"r\", encoding=\"utf-8\") as file:\n",
    "    print(file.read())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We could use [get_configurable_parameter](https://github.com/openvinotoolkit/anomalib/blob/main/anomalib/config/config.py#L114) function to read the configs from the path and return them in a dictionary. We use the default config file that comes with Padim implementation, which uses `./datasets/MVTec` as the path to the dataset. We need to overwrite this after loading the config."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# pass the config file to model, callbacks and datamodule\n",
    "config = get_configurable_parameters(config_path=CONFIG_PATH)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Dataset: MVTec AD\n",
    "\n",
    "**MVTec AD** is a dataset for benchmarking anomaly detection methods with a focus on industrial inspection. It contains over **5000** high-resolution images divided into **15** different object and texture categories. Each category comprises a set of defect-free training images and a test set of images with various kinds of defects as well as images without defects. If the dataset is not located in the root datasets directory, anomalib will automatically install the dataset.\n",
    "\n",
    "We could now import the MVtec AD dataset using its specific datamodule implemented in anomalib."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "datamodule = get_datamodule(config)\n",
    "datamodule.setup()  # Downloads the dataset if it's not in the specified `root` directory\n",
    "datamodule.prepare_data()  # Create train/val/test/prediction sets.\n",
    "\n",
    "i, data = next(enumerate(datamodule.val_dataloader()))\n",
    "print(data.keys())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's check the shapes of the input images and masks."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(data[\"image\"].shape, data[\"mask\"].shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We could now visualize a normal and abnormal sample from the validation set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def show_image_and_mask(sample: dict[str, Any], index: int) -> Image:\n",
    "    img = ToPILImage()(Denormalize()(sample[\"image\"][index].clone()))\n",
    "    msk = ToPILImage()(sample[\"mask\"][index]).convert(\"RGB\")\n",
    "\n",
    "    return Image.fromarray(np.hstack((np.array(img), np.array(msk))))\n",
    "\n",
    "\n",
    "# Visualize a normal image-mask\n",
    "show_image_and_mask(data, index=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Visualize an abnormal image-mask\n",
    "show_image_and_mask(data, index=20)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Prepare Model and Callbacks\n",
    "\n",
    "Now, the config file is updated as we want. We can now start model training with it. Here we will be using `datamodule`, `model` and `callbacks` to train the model. Callbacks are self-contained objects, which contains non-essential logic. This way we could inject as many callbacks as possible such as ModelLoading, Timer, Metrics, Normalization and Visualization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = get_model(config)\n",
    "callbacks = get_callbacks(config)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# start training\n",
    "trainer = Trainer(**config.trainer, callbacks=callbacks)\n",
    "trainer.fit(model=model, datamodule=datamodule)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Validation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# load best model from checkpoint before evaluating\n",
    "load_model_callback = LoadModelCallback(weights_path=trainer.checkpoint_callback.best_model_path)\n",
    "trainer.callbacks.insert(0, load_model_callback)\n",
    "trainer.test(model=model, datamodule=datamodule)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By default, the output images are saved into `results` directory. We could get the output filenames from the directory, read the saved the images and visualize here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(config[\"project\"][\"path\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "image_filenames = list(Path(config[\"project\"][\"path\"]).glob(\"**/*.png\"))\n",
    "print(image_filenames[0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for filename in image_filenames:\n",
    "    image = Image.open(filename)\n",
    "    display(image)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "anomalib",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.13 (default, Nov  6 2022, 23:15:27) \n[GCC 9.3.0]"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "ae223df28f60859a2f400fae8b3a1034248e0a469f5599fd9a89c32908ed7a84"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
